随着深度神经网络(DNN)的发展以解决日益复杂的问题,它们正受到现有数字处理器的延迟和功耗的限制。为了提高速度和能源效率,已经提出了专门的模拟光学和电子硬件,但是可扩展性有限(输入矢量长度$ k $的数百个元素)。在这里,我们提出了一个可扩展的,单层模拟光学处理器,该光学处理器使用自由空间光学器件可重新配置输入向量和集成的光电,用于静态,可更新的加权和非线性 - 具有$ k \ \ 1,000 $和大约1,000美元和超过。我们通过实验测试MNIST手写数字数据集的分类精度,在没有数据预处理或在硬件上进行数据重新处理的情况下达到94.7%(地面真相96.3%)。我们还确定吞吐量($ \ sim $ 0.9 examac/s)的基本上限,由最大光带宽设置,然后大大增加误差。我们在兼容CMOS兼容系统中宽光谱和空间带宽的组合可以实现下一代DNN的高效计算。
translated by 谷歌翻译
与小组元素的作用一样,在数学中通常用于分析或利用给定问题设置中固有的对称性。在这里,我们提供有效的量子算法,用于对存储为量子状态的数据进行线性组卷积和互相关。我们的算法的运行时间在组的维度上是对数,因此与经典算法相比,当输入数据作为量子状态和线性操作提供良好的条件时,提供了指数加速。我们的理论框架是出于解决代数问题的量子算法的丰富文献,为量化机器学习和采用小组操作的数值方法中的许多算法开辟了一条途径。
translated by 谷歌翻译
Increasingly taking place in online spaces, modern political conversations are typically perceived to be unproductively affirming -- siloed in so called ``echo chambers'' of exclusively like-minded discussants. Yet, to date we lack sufficient means to measure viewpoint diversity in conversations. To this end, in this paper, we operationalize two viewpoint metrics proposed for recommender systems and adapt them to the context of social media conversations. This is the first study to apply these two metrics (Representation and Fragmentation) to real world data and to consider the implications for online conversations specifically. We apply these measures to two topics -- daylight savings time (DST), which serves as a control, and the more politically polarized topic of immigration. We find that the diversity scores for both Fragmentation and Representation are lower for immigration than for DST. Further, we find that while pro-immigrant views receive consistent pushback on the platform, anti-immigrant views largely operate within echo chambers. We observe less severe yet similar patterns for DST. Taken together, Representation and Fragmentation paint a meaningful and important new picture of viewpoint diversity.
translated by 谷歌翻译
Electricity prices in liberalized markets are determined by the supply and demand for electric power, which are in turn driven by various external influences that vary strongly in time. In perfect competition, the merit order principle describes that dispatchable power plants enter the market in the order of their marginal costs to meet the residual load, i.e. the difference of load and renewable generation. Many market models implement this principle to predict electricity prices but typically require certain assumptions and simplifications. In this article, we present an explainable machine learning model for the prices on the German day-ahead market, which substantially outperforms a benchmark model based on the merit order principle. Our model is designed for the ex-post analysis of prices and thus builds on various external features. Using Shapley Additive exPlanation (SHAP) values, we can disentangle the role of the different features and quantify their importance from empiric data. Load, wind and solar generation are most important, as expected, but wind power appears to affect prices stronger than solar power does. Fuel prices also rank highly and show nontrivial dependencies, including strong interactions with other features revealed by a SHAP interaction analysis. Large generation ramps are correlated with high prices, again with strong feature interactions, due to the limited flexibility of nuclear and lignite plants. Our results further contribute to model development by providing quantitative insights directly from data.
translated by 谷歌翻译
Data-driven modeling has become a key building block in computational science and engineering. However, data that are available in science and engineering are typically scarce, often polluted with noise and affected by measurement errors and other perturbations, which makes learning the dynamics of systems challenging. In this work, we propose to combine data-driven modeling via operator inference with the dynamic training via roll outs of neural ordinary differential equations. Operator inference with roll outs inherits interpretability, scalability, and structure preservation of traditional operator inference while leveraging the dynamic training via roll outs over multiple time steps to increase stability and robustness for learning from low-quality and noisy data. Numerical experiments with data describing shallow water waves and surface quasi-geostrophic dynamics demonstrate that operator inference with roll outs provides predictive models from training trajectories even if data are sampled sparsely in time and polluted with noise of up to 10%.
translated by 谷歌翻译
Brain-inspired computing proposes a set of algorithmic principles that hold promise for advancing artificial intelligence. They endow systems with self learning capabilities, efficient energy usage, and high storage capacity. A core concept that lies at the heart of brain computation is sequence learning and prediction. This form of computation is essential for almost all our daily tasks such as movement generation, perception, and language. Understanding how the brain performs such a computation is not only important to advance neuroscience but also to pave the way to new technological brain-inspired applications. A previously developed spiking neural network implementation of sequence prediction and recall learns complex, high-order sequences in an unsupervised manner by local, biologically inspired plasticity rules. An emerging type of hardware that holds promise for efficiently running this type of algorithm is neuromorphic hardware. It emulates the way the brain processes information and maps neurons and synapses directly into a physical substrate. Memristive devices have been identified as potential synaptic elements in neuromorphic hardware. In particular, redox-induced resistive random access memories (ReRAM) devices stand out at many aspects. They permit scalability, are energy efficient and fast, and can implement biological plasticity rules. In this work, we study the feasibility of using ReRAM devices as a replacement of the biological synapses in the sequence learning model. We implement and simulate the model including the ReRAM plasticity using the neural simulator NEST. We investigate the effect of different device properties on the performance characteristics of the sequence learning model, and demonstrate resilience with respect to different on-off ratios, conductance resolutions, device variability, and synaptic failure.
translated by 谷歌翻译
Accurate and consistent vehicle localization in urban areas is challenging due to the large-scale and complicated environments. In this paper, we propose onlineFGO, a novel time-centric graph-optimization-based localization method that fuses multiple sensor measurements with the continuous-time trajectory representation for vehicle localization tasks. We generalize the graph construction independent of any spatial sensor measurements by creating the states deterministically on time. As the trajectory representation in continuous-time enables querying states at arbitrary times, incoming sensor measurements can be factorized on the graph without requiring state alignment. We integrate different GNSS observations: pseudorange, deltarange, and time-differenced carrier phase (TDCP) to ensure global reference and fuse the relative motion from a LiDAR-odometry to improve the localization consistency while GNSS observations are not available. Experiments on general performance, effects of different factors, and hyper-parameter settings are conducted in a real-world measurement campaign in Aachen city that contains different urban scenarios. Our results show an average 2D error of 0.99m and consistent state estimation in urban scenarios.
translated by 谷歌翻译
This article focuses on the control center of each human body: the brain. We will point out the pivotal role of the cerebral vasculature and how its complex mechanisms may vary between subjects. We then emphasize a specific acute pathological state, i.e., acute ischemic stroke, and show how medical imaging and its analysis can be used to define the treatment. We show how the core-penumbra concept is used in practice using mismatch criteria and how machine learning can be used to make predictions of the final infarct, either via deconvolution or convolutional neural networks.
translated by 谷歌翻译
Pre-trained language models (PLMs) have outperformed other NLP models on a wide range of tasks. Opting for a more thorough understanding of their capabilities and inner workings, researchers have established the extend to which they capture lower-level knowledge like grammaticality, and mid-level semantic knowledge like factual understanding. However, there is still little understanding of their knowledge of higher-level aspects of language. In particular, despite the importance of sociodemographic aspects in shaping our language, the questions of whether, where, and how PLMs encode these aspects, e.g., gender or age, is still unexplored. We address this research gap by probing the sociodemographic knowledge of different single-GPU PLMs on multiple English data sets via traditional classifier probing and information-theoretic minimum description length probing. Our results show that PLMs do encode these sociodemographics, and that this knowledge is sometimes spread across the layers of some of the tested PLMs. We further conduct a multilingual analysis and investigate the effect of supplementary training to further explore to what extent, where, and with what amount of pre-training data the knowledge is encoded. Our overall results indicate that sociodemographic knowledge is still a major challenge for NLP. PLMs require large amounts of pre-training data to acquire the knowledge and models that excel in general language understanding do not seem to own more knowledge about these aspects.
translated by 谷歌翻译
Fairness and environmental impact are important research directions for the sustainable development of artificial intelligence. However, while each topic is an active research area in natural language processing (NLP), there is a surprising lack of research on the interplay between the two fields. This lacuna is highly problematic, since there is increasing evidence that an exclusive focus on fairness can actually hinder environmental sustainability, and vice versa. In this work, we shed light on this crucial intersection in NLP by (1) investigating the efficiency of current fairness approaches through surveying example methods for reducing unfair stereotypical bias from the literature, and (2) evaluating a common technique to reduce energy consumption (and thus environmental impact) of English NLP models, knowledge distillation (KD), for its impact on fairness. In this case study, we evaluate the effect of important KD factors, including layer and dimensionality reduction, with respect to: (a) performance on the distillation task (natural language inference and semantic similarity prediction), and (b) multiple measures and dimensions of stereotypical bias (e.g., gender bias measured via the Word Embedding Association Test). Our results lead us to clarify current assumptions regarding the effect of KD on unfair bias: contrary to other findings, we show that KD can actually decrease model fairness.
translated by 谷歌翻译